Encoding semantic relationships in literary texts
Authors
Abstract
Encoding meaningful semantic relationships in literary texts is almost as difficult as defining and identifying them. Defining the types and the components of semantic relationships that can be extracted from a literary text is quite a challenging task, because literature is full of implicit and oblique messages and references. Encoding these relationships is even more challenging, because relations often have neither a clear nor a standard linguistic form, and they usually overlap each other. This paper discusses the modeling issues concerning the encoding of semantic relationships that map the cultural content of humanities texts, as highlighted by the case of the ECARLE project annotation campaign. To handle these issues, the paper proposes a methodology of minimalistic and flexible encoding techniques, combined in order to generate human-annotated training data for a Relation Extraction machine learning system. The proposed methodology uses the available TEI tagset and, without any further customizations, allows the encoding of well-formed relations between named entities in a simple yet flexible way, open to reuse, interchange, conversion and visualization.
Similar resources
Annotating Character Relationships in Literary Texts
We present a dataset of manually annotated relationships between characters in literary texts, in order to support the training and evaluation of automatic methods for relation type prediction in this domain (Makazhanov et al., 2014; Kokkinakis, 2013) and the broader computational analysis of literary character (Elson et al., 2010; Bamman et al., 2014; Vala et al., 2015; Flekova and Gurevych, 2...
Literary Figures in Gāthic Texts
Introduction: Gāthic texts are a collection of religious songs of Zarathustra, who lived about 1200 BC. Of the seventy-two hāts (stanzas) of Yasna (one of the five chapters of Avesta), seventeen hāts belong to five Gāthas. These seventeen hāts have been classified into five categories based on their syllabic meter and the number of the song: 1) ahunavaiti, 2) ushtavaiti, 3) spanta.mainyu, ...
Annotating Similes in Literary Texts
Annotated corpora are invaluable resources for researchers in the humanities: on the one hand, for natural language processing tasks, they can serve as standards against which results from new automatic methods can be measured; on the other hand, in corpus-based studies, they make it possible either to answer existing research questions or to explore original ones. In this respect, some annotation frameworks such...
Identifying Literary Texts with Bigrams
We study perceptions of literariness in a set of contemporary Dutch novels. Experiments with machine learning models show that it is possible to automatically distinguish novels that are seen as highly literary from those that are seen as less literary, using surprisingly simple textual features. The most discriminating features of our classification model indicate that genre might be a confoun...
Discovering Multilingual Text Reuse in Literary Texts
We present here a method for automatically discovering several classes of text reuse across different languages, from the most similar (translations) to the most oblique (literary allusions). Allusions are an important subclass of reuse because they involve the appropriation of isolated words and phrases within otherwise unrelated sentences, so that traditional methods of identifying reuse incl...
Journal
Journal title: Balisage Series on Markup Technologies
Year: 2021
ISSN: 1947-2609
DOI: https://doi.org/10.4242/balisagevol26.koidaki01